Selecting and Checking Field Type
SmartZone OCR lets you apply predefined masks that define the expected format of your data. If you know your data should match one of these formats, specifying the field type (using the Reader.FieldType property of SmartZoneOCR) makes your recognition results more accurate.
The FieldType parameter assists in the recognition of text: SmartZone OCR tries harder to match the result to the format supported by that field type. It is very important that you select the correct field type for an image. If you do not know the format, GeneralText is the appropriate field type to use.
If the field type determined by SmartZone OCR matches the field type you specified on input, the same value is returned in the FieldType property of the TextBlockResult. Otherwise, Unknown is returned as the FieldType value, which means the result did not conform to the expected format of the FieldType specified on input. For example, if the Date FieldType is specified, but SmartZone OCR reads "01~23-47", then the FieldType "Unknown" will be returned.
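The conformance check described above can be pictured with a small Python sketch. This is not the SmartZone OCR API: `returned_field_type` and `DATE_PATTERNS` are hypothetical names introduced for illustration, and the patterns only test the numeric layouts listed for the Date field type, not whether the month or day values are valid.

```python
import re

# Hypothetical stand-in for the check described above; NOT the SmartZone OCR API.
# The patterns mirror the layouts listed for the Date field type and test
# shape only (digit counts and separators), not calendar validity.
DATE_PATTERNS = [
    r"\d{2}-\d{2}-\d{2}(?:\d{2})?",  # MM-DD-YY, MM-DD-YYYY, DD-MM-YY, DD-MM-YYYY, YY-MM-DD
    r"\d{2}/\d{2}/\d{2}(?:\d{2})?",  # the same layouts with "/" separators
    r"\d{4}-\d{2}-\d{2}",            # YYYY-MM-DD
    r"\d{4}/\d{2}/\d{2}",            # YYYY/MM/DD
]


def returned_field_type(requested: str, text: str) -> str:
    """Return the requested field type if the text conforms to it, else "Unknown"."""
    if requested == "Date":
        conforms = any(re.fullmatch(p, text) for p in DATE_PATTERNS)
    elif requested == "GeneralText":
        conforms = True  # no format constraint in this sketch
    else:
        raise ValueError(f"field type not covered by this sketch: {requested}")
    return requested if conforms else "Unknown"


print(returned_field_type("Date", "01~23-47"))  # the misread example from the text
print(returned_field_type("Date", "01-23-47"))
```

In this sketch the misread "01~23-47" fails every date layout because of the "~" character, so the requested type is replaced by "Unknown", while "01-23-47" keeps the Date type.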
Supported Field Types
The supported field types include the following:
Field Type | Description |
Currency | The supported currency symbols, currency punctuation, and digits: $ ¢ £ ¥ € , . ` - = 0123456789. Supported formats place the currency symbol in front of the digits, with commas and periods as grouping and decimal separators. The € symbol may also be placed to the right of the rightmost digit. |
Currency Plus | The supported alphabetic abbreviations for currency symbols, currency punctuation, and digits: USD GBP EUR E DKK Dkr KR NOK Nkr SEK Sk $ ¢ £ ¥ € , . ` - = 0123456789. |
Data Validation Lists | A data validation list is a set of possible expected results. Using a data validation list as a field type improves recognition results by narrowing the possible answers returned by character recognition when there is an ambiguity or conflict. An example of a data validation list is a list of two-character US state abbreviations, from AL to WY. See Define and Edit Data Validation Lists for more information. |
Date | MM-DD-YY, MM-DD-YYYY, MM/DD/YY, MM/DD/YYYY, DD-MM-YY, DD-MM-YYYY, DD/MM/YY, DD/MM/YYYY, YY-MM-DD, YYYY-MM-DD, YY/MM/DD, YYYY/MM/DD |
Email | The local name and the domain name are evaluated separately, using the @ as the delimiter. Each may use any of these ASCII characters: |
General Text | All the supported characters in English, French, Spanish, Italian, German, Dutch, Portuguese, Norwegian, Finnish, Danish, and Swedish. |
Regular Expression | A regular expression is a pattern, written as a string, that describes or matches the format of expected results according to certain rules. Using a regular expression as a field type improves recognition results by narrowing the possible answers returned by character recognition when there is an ambiguity or conflict. A number of SmartZone's built-in field types already use regular expressions. For example, US Social Security Number is a regular expression of the form \d{3}-?\d{2}-?\d{4}. See Regular Expressions for more examples and detailed syntax. |
Social Security Number | 999-99-9999 |
Time | HH:MM:SS, HH.MM.SS, HH:MM:SS am/pm, HH.MM.SS am/pm. Where: |
United States Phone Number | Digits 0-9, ( ), /, EXT, ext. Phone numbers may be formatted with or without the leading 1 and with or without the area code: 1 (999) 999-9999, (999) 999-9999, 999-9999, 1 (999) 999/9999, 999-999-9999, 999/999/9999, 999-999/9999. Use ext, EXT, X, or x as the extension indicator, followed to its right by the two- to four-digit extension number. |
URL | http://www.name Supported extensions include: |
For the best recognition accuracy, set the character set to the narrowest set that still includes all possible returned values, then indicate the expected formats of the recognition results by applying the field types listed here. Field types improve recognition by defining the number of characters or digits and the formats of expected results, which allows SmartZone OCR to choose more wisely among several possible returned values.
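As a rough illustration of how a regular expression or a data validation list narrows ambiguous readings, the Python sketch below filters a list of candidate strings. It does not reproduce SmartZone OCR's internal logic: `pick_candidate` is a hypothetical helper, the candidate readings are made up, and the state list is deliberately partial. Only the Social Security Number pattern is taken from the table above.

```python
import re

# Pattern quoted in the Regular Expression row of the table above.
SSN_PATTERN = re.compile(r"\d{3}-?\d{2}-?\d{4}")

# Partial list for illustration only; a real data validation list would
# contain every two-character state abbreviation from AL to WY.
STATE_ABBREVIATIONS = {"AL", "AK", "AZ", "CA", "NY", "TX", "WY"}


def pick_candidate(candidates, accepts):
    """Return the first candidate reading accepted by the predicate, else None.

    Hypothetical helper: it only illustrates how a format constraint can
    break a tie between competing character readings.
    """
    for text in candidates:
        if accepts(text):
            return text
    return None


# A lowercase "l" misread in place of "1" fails the pattern, so the second reading wins.
ssn = pick_candidate(
    ["l23-45-6789", "123-45-6789"],
    lambda s: SSN_PATTERN.fullmatch(s) is not None,
)

# "VV" misread in place of "W" is not in the list, so "WY" is chosen.
state = pick_candidate(["VVY", "WY"], lambda s: s in STATE_ABBREVIATIONS)

print(ssn, state)
```

The same helper works for both field types because each one reduces to a yes/no test on a candidate string: a pattern match in one case, list membership in the other.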